166 research outputs found

A Pan-cancer Somatic Mutation Embedding using Autoencoders

Authors: Beauseroy Pierre; Palazzo Martin; Yankilevich Patricio
Publication venue: Springer Science and Business Media LLC
Publication date: 01/12/2019

Background: Next-generation sequencing instruments are providing new opportunities for comprehensive analyses of cancer genomes. The increasing availability of tumor data makes it possible to study the complexity of cancer with machine learning methods. The large available repositories of high-dimensional tumor samples characterised with germline and somatic mutation data require advanced computational modelling for data interpretation. In this work, we propose to analyze this complex data with neural network learning, a methodology that has made impressive advances in image and natural language processing. Results: Here we present a tumor mutation profile analysis pipeline based on an autoencoder model, which is used to discover better representations of lower dimensionality from large somatic mutation data of 40 different tumor types and subtypes. Kernel learning and hierarchical cluster analysis are used to assess the quality of the learned somatic mutation embedding, on which support vector machine models are used to accurately classify tumor subtypes. Conclusions: The learned latent space maps the original samples into a much lower dimension while keeping the biological signals from the original tumor samples. This pipeline and the resulting embedding allow easier exploration of the heterogeneity within and across tumor types and accurate classification of tumor samples in the pan-cancer somatic mutation landscape.
Affiliation: Palazzo, Martin. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigación en Biomedicina de Buenos Aires - Instituto Partner de la Sociedad Max Planck; Argentina. Universidad Tecnológica Nacional; Argentina
Affiliation: Beauseroy, Pierre. Université de Technologie de Troyes; France
Affiliation: Yankilevich, Patricio. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigación en Biomedicina de Buenos Aires - Instituto Partner de la Sociedad Max Planck; Argentina
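
As a rough illustration of the autoencoder idea in the abstract above, the following sketch trains a tiny one-hidden-layer autoencoder with plain NumPy on a synthetic binary matrix. The data, the layer sizes, the learning rate, and the name `train_autoencoder` are all assumptions made for illustration; the abstract does not specify the authors' architecture or training setup.

```python
import numpy as np

# Synthetic stand-in for a binary sample-by-gene mutation matrix (assumption).
rng = np.random.default_rng(0)
X = (rng.random((60, 30)) < 0.2).astype(float)

def train_autoencoder(X, latent_dim=5, lr=0.05, epochs=300, seed=0):
    """Train a minimal tanh-encoder / linear-decoder autoencoder by gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = 0.1 * rng.standard_normal((d, latent_dim))
    W_dec = 0.1 * rng.standard_normal((latent_dim, d))
    losses = []
    for _ in range(epochs):
        Z = np.tanh(X @ W_enc)           # low-dimensional embedding
        err = Z @ W_dec - X              # reconstruction error
        losses.append(float(np.mean(err ** 2)))
        grad_dec = Z.T @ err / n
        grad_enc = X.T @ ((err @ W_dec.T) * (1.0 - Z ** 2)) / n
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    return W_enc, W_dec, losses

W_enc, W_dec, losses = train_autoencoder(X)
embedding = np.tanh(X @ W_enc)           # one 5-dimensional vector per sample
print(losses[0], losses[-1])
```

The rows of `embedding` play the role of the learned latent representation; in a pipeline like the one described, they would then be passed to kernel methods or a support vector machine.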

CONICET Digital

HAL Descartes

Hal-Diderot

Feature extraction and selection using statistical dependence criteria

Authors: Beauseroy Pierre; Marx Nicolás; Tomassi Diego
Publication date: 01/09/2016

Dimensionality reduction using feature extraction and selection approaches is a common stage of many regression and classification tasks. In recent years there have been significant efforts to reduce the dimension of the feature space without losing information that is relevant for prediction. This objective can be cast as a conditional independence condition between the response or class labels and the transformed features. Building on this, in this work we use measures of statistical dependence to estimate a lower-dimensional linear subspace of the features that retains the sufficient information. Unlike likelihood-based and many moment-based methods, the proposed approach is semi-parametric and does not require model assumptions on the data. A regularized version that achieves simultaneous variable selection is also presented. Experiments with simulated data show that the performance of the proposed method compares favorably to well-known linear dimension reduction techniques.
Sociedad Argentina de Informática e Investigación Operativa (SADIO)
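
To make the idea of scoring a projection by statistical dependence concrete, here is a minimal sketch that uses the Hilbert-Schmidt Independence Criterion (HSIC) with Gaussian kernels. The abstract does not name the dependence measure the authors use, so HSIC, the kernel width, the synthetic data, and the helper names `rbf_gram` and `hsic` are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_gram(v, sigma=1.0):
    """Gaussian Gram matrix of a 1-D sample."""
    d2 = (v[:, None] - v[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC between two 1-D samples."""
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    K, L = rbf_gram(x, sigma), rbf_gram(y, sigma)
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 2))
y = X[:, 0] + 0.1 * rng.standard_normal(100)   # response depends on feature 0 only

score_relevant = hsic(X @ np.array([1.0, 0.0]), y)    # projection onto feature 0
score_irrelevant = hsic(X @ np.array([0.0, 1.0]), y)  # projection onto feature 1
print(score_relevant, score_irrelevant)
```

A dimension reduction method in this spirit would search over projection directions for the one that maximizes the dependence score, rather than comparing two fixed directions as done here.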

Feature extraction and selection using statistical dependence criteria

Authors: Beauseroy Pierre; Marx Nicolás; Tomassi Diego
Publication date: 22/11/2016

Gene-Based Multiclass Cancer Diagnosis with Class-Selective Rejections

Authors: Beauseroy Pierre; Grall-Maës Edith; Jrad Nisrine
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2009

Supervised learning on microarray data has received much attention in recent years. Multiclass cancer diagnosis based on selected gene profiles is used as an adjunct to clinical diagnosis. However, supervised diagnosis may hinder patient care, add expense, or confound a result. To avoid such misleading outcomes, a multiclass cancer diagnosis with class-selective rejection is proposed. It rejects some patients from one, some, or all classes in order to ensure higher reliability while reducing time and expense. Moreover, this classifier takes into account asymmetric penalties dependent on each class and on each wrong or partially correct decision. It is based on ν-1-SVM coupled with its regularization path and minimizes a general loss function defined in the class-selective rejection scheme. State-of-the-art multiclass algorithms can be considered a particular case of the proposed algorithm in which the number of decisions is given by the classes and the loss function is defined by the Bayesian risk. Two experiments are carried out, in the Bayesian and the class-selective rejection frameworks. Five gene-selected datasets are used to assess the performance of the proposed method. Results are discussed and accuracies are compared with those of the Naive Bayes, Nearest Neighbor, Linear Perceptron, Multilayer Perceptron, and Support Vector Machine classifiers.
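
The decision rule behind class-selective rejection can be sketched as follows: instead of committing to one class, the classifier returns the subset of classes with the lowest expected loss. This sketch works directly from given class posteriors and is not the ν-1-SVM method of the paper; the cost structure (`miss_cost` per class, `size_cost` per extra retained class) and the function name are hypothetical.

```python
from itertools import combinations

def class_selective_decision(posterior, miss_cost, size_cost):
    """Return the subset of classes with minimal expected loss, and that loss.

    posterior : class probabilities for one sample
    miss_cost : penalty per class when the true class is left out of the subset
    size_cost : penalty for every class retained beyond the first
    """
    n = len(posterior)
    best, best_risk = None, float("inf")
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            # Expected penalty for excluding the true class (asymmetric per class).
            risk = sum(posterior[c] * miss_cost[c] for c in range(n) if c not in subset)
            # Penalty for an imprecise (multi-class) answer.
            risk += size_cost * (k - 1)
            if risk < best_risk:
                best, best_risk = subset, risk
    return best, best_risk

confident, _ = class_selective_decision([0.90, 0.05, 0.05], [1, 1, 1], 0.2)
ambiguous, _ = class_selective_decision([0.45, 0.45, 0.10], [1, 1, 1], 0.2)
asymmetric, _ = class_selective_decision([0.70, 0.20, 0.10], [1, 5, 1], 0.2)
print(confident, ambiguous, asymmetric)
```

With a clear posterior the rule keeps a single class; with two equally likely classes it keeps both; and raising the penalty for missing class 1 makes the rule retain it even when it is not the most probable.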

Crossref

Directory of Open Access Journals

PubMed Central

HAL Descartes

Hal-Diderot

Learning Kernels from genetic profiles to discriminate tumor subtypes

Authors: Beauseroy Pierre; Koile Daniel; Palazzo Martín; Yankilevich Patricio
Publication date: 01/09/2018

Our work aims to perform the feature selection step in Multiple Kernel Learning by optimizing the Kernel Target Alignment (KTA) score. It begins by building feature-wise Gaussian kernel functions. Then, through a constrained linear combination of the feature-wise kernels, we aim to increase the KTA and obtain a new, optimized custom kernel. The linear combination yields a sparse solution in which only a few kernels survive to improve the KTA, and consequently a reduced feature subset is obtained. Considerably reducing the original gene set allows the selected genes to be studied in more depth for clinical purposes. The higher the KTA obtained, the better the feature selection, since the custom kernels are built to be used later for classification. The final kernel after optimizing the KTA is a linear combination of kernels Ki, each associated with a coefficient μi. The μ vector is computed during the optimization process.
Sociedad Argentina de Informática e Investigación Operativa
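
The Kernel Target Alignment score mentioned above can be sketched in a few lines. This example uses the standard uncentered alignment between a kernel matrix and the ideal kernel built from the labels; whether the authors use this or a centered variant is not stated in the abstract. The synthetic features, the kernel width `gamma`, and the weights in `mu` are assumptions, and the weights are fixed by hand here rather than optimized.

```python
import numpy as np

def gaussian_kernel(x, gamma=1.0):
    """Gaussian kernel matrix built from a single feature (1-D array)."""
    d = x[:, None] - x[None, :]
    return np.exp(-gamma * d ** 2)

def kernel_target_alignment(K, y):
    """Uncentered alignment between K and the ideal kernel y y^T, with y in {-1, +1}."""
    T = np.outer(y, y)
    return float(np.sum(K * T) / (np.linalg.norm(K) * np.linalg.norm(T)))

rng = np.random.default_rng(0)
y = np.repeat([-1.0, 1.0], 20)
informative = y + 0.3 * rng.standard_normal(40)   # feature that tracks the class
noise = rng.standard_normal(40)                   # feature unrelated to the class

K_informative = gaussian_kernel(informative)
K_noise = gaussian_kernel(noise)
kta_informative = kernel_target_alignment(K_informative, y)
kta_noise = kernel_target_alignment(K_noise, y)

# Hypothetical combination weights; the paper obtains them by constrained optimization.
mu = np.array([0.8, 0.2])
K_combined = mu[0] * K_informative + mu[1] * K_noise
kta_combined = kernel_target_alignment(K_combined, y)
print(kta_informative, kta_noise, kta_combined)
```

A feature whose kernel aligns poorly with the labels contributes little, which is why a sparse weight vector amounts to feature selection.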

HAL Descartes

Servicio de Difusión de la Creación Intelectual

Hal-Diderot

Learning Multiclass Rules with Class-Selective Rejection and Performance Constraints

Authors: Edith Grall-Maes; Nisrine Jrad; Pierre Beauseroy
Publication venue: IntechOpen
Publication date: 01/02/2010

IntechOpen

HAL Descartes

Hal-Diderot

Learning Kernels from genetic profiles to discriminate tumor subtypes

Authors: Beauseroy Pierre; Koile Daniel; Palazzo Martín; Yankilevich Patricio
Publication date: 09/11/2018

Servicio de Difusión de la Creación Intelectual

Learning Kernels from genetic profiles to discriminate tumor subtypes

Authors: Beauseroy Pierre; Koile Daniel; Palazzo Martín; Yankilevich Patricio
Publication date: 01/09/2018

Extraction d'attributs discriminants par optimisation de fonctions paramétrées [Extraction of discriminant features by optimization of parameterized functions]

Authors: BEAUSEROY Pierre; GRALL-MAËS Edith
Publication venue: GRETSI, Groupe d’Etudes du Traitement du Signal et des Images
Publication date: 01/01/2003

A method is proposed for automatically extracting discriminant features for a process described by a set of labeled examples. The features are selected using families of parameterized functions, by determining the parameters that are optimal with respect to a class separability criterion. The chosen parameterized functions measure characteristics corresponding to the zeroth- or first-order moments of a weighted one- or two-dimensional representation. Because the parameterized functions are continuous, an infinite set of features can be explored without having to solve a problem of combinatorial complexity. The criterion measuring class separability is based on scatter matrices and allows features to be selected jointly. The design of a linear classifier adapted to the extracted features is proposed. The method is applied to simulated signals described by their time-domain representation.
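
A class separability criterion based on scatter matrices can be illustrated with the classical measure trace(Sw^-1 Sb), where Sw is the within-class scatter and Sb the between-class scatter. The abstract does not say which scatter-based criterion the authors use, so this particular form, the function name `separability`, and the toy data are assumptions.

```python
import numpy as np

def separability(X, y):
    """Scatter-matrix criterion trace(Sw^-1 Sb) for features X and labels y."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))   # within-class scatter
    Sb = np.zeros((d, d))   # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    return float(np.trace(np.linalg.solve(Sw, Sb)))

# Two classes with identical shape, placed far apart or close together.
square = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
J_far = separability(np.vstack([square, square + 10.0]), y)
J_near = separability(np.vstack([square, square + 0.5]), y)
print(J_far, J_near)
```

Because the criterion is evaluated on all features at once, it scores a candidate feature set jointly, which is the property the abstract highlights.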

I-Revues

HAL Descartes

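Classification basée sur l'extraction conjointe d'attributs dans le plan temps-fréquence selon un critère d'information mutuelle [Classification based on joint feature extraction in the time-frequency plane according to a mutual information criterion]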

Authors: BEAUSEROY Pierre; GRALL-MAËS Edith
Publication venue: GRETSI, Groupe d’Etudes du Traitement du Signal et des Images
Publication date: 01/01/1999

The proposed method addresses signal classification based on automatic, joint feature extraction, for nonstationary processes described only by a set of labeled examples. The features are defined as the result of transformations applied to the Wigner-Ville distribution of the signal to be classified. Each transformation is selected from a family of parameterized transformations. The parameter values are optimized to maximize the discriminant information jointly carried by the features, given a training set. The discriminant information is measured with a mutual information criterion based on estimates of the joint distributions of the features. To take into account all the information carried by the features and to remain consistent with the extraction phase, the classifier also uses the estimated probability distributions. The resulting approach has the advantage of making no assumption about the distributions of the features conditional on each class. The method was applied to a sleep electroencephalogram signal classification problem. Good performance was obtained from the joint extraction of two features.
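
A mutual information criterion of the kind described above can be sketched with a simple histogram estimate. This example scores a single feature against the class labels, whereas the abstract describes a joint criterion over several features; the binning scheme, the function name `mutual_information`, and the toy data are assumptions.

```python
import numpy as np

def mutual_information(feature, labels, bins=8):
    """Histogram estimate (in nats) of the mutual information between a feature and labels."""
    feature = np.asarray(feature, dtype=float)
    edges = np.histogram_bin_edges(feature, bins=bins)
    idx = np.digitize(feature, edges[1:-1])          # bin index in 0..bins-1
    classes, y = np.unique(labels, return_inverse=True)
    joint = np.zeros((bins, len(classes)))
    np.add.at(joint, (idx, y), 1.0)                  # joint counts
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)            # feature marginal
    py = joint.sum(axis=0, keepdims=True)            # class marginal
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

labels = np.array([0, 0, 1, 1])
mi_informative = mutual_information([0.0, 0.0, 1.0, 1.0], labels, bins=2)
mi_uninformative = mutual_information([0.0, 1.0, 0.0, 1.0], labels, bins=2)
print(mi_informative, mi_uninformative)
```

A feature that determines the class reaches the entropy of the labels (log 2 for two balanced classes), while a feature independent of the class scores zero.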

I-Revues

HAL Descartes

Hal-Diderot